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Abstract 

Bayesian inference has been used successfully for 
many problems where the aim is to infer the param- 
eters of a model of interest . In this paper we for- 
mulate the three dimensional reconstruction problem 
as the problem of inferring the parameters of a sur- 
face model from image data , and show how Bayesian 
methods can be used to estimate the parameters of this 
model given the image data. Thus we recover the three 
dimensional description of the scene. This approach 
also gives great flexibility. We can specify the geomet- 
rical properties of the model to suif our purpose , and 
can also use different models for how the surface re- 
flects the light incident upon it. In common with other 
Bayesian inference problems , the estimation method- 
ology requires that we can simulate the data that would 
have been recoded for any values of the model parame- 
ters. In this application this means that if we have im- 
age data we must be able to render the surface model . 
However it also means that we can infer the parame- 
ters of a model whose resolution can be chosen irre- 
spective of the resolution of the images, and may be 
super- resolved. We present results of the inference of 
surface models from simulated aerial photographs for 
the case of super- resolution , where many surface ele- 
ments project into a single pixel in the low-resolution 
images. 

1 Introduction 

Bayesian inference has proved to be the method of 
choice for many inference problems enabling accurate 
estimation of parameters of interest from noisy and 
incomplete data, and also providing estimates of the 
errors associated with the inferred parameters. The 
general approach is illustrated in figure 1. The fig- 
ure shows that synthetic observations of the model are 
made using a computer simulation of the observation 
process, and these are compared with the actual ob- 
servations. The error between the actual and the sim- 
ulated observations is used to adjust the parameters 
of the model, to minimize the errors. The estimation 
of the errors on the parameter estimates (or more gen- 
erally, the estimation of the covariance matrix of the 
parameters) means that the parameter estimates can 
be updated when more observations become available. 


In the application to the reconstruction of three 
dimensional surfaces, the model is the parameterized 
surface model and the measurement system is a com- 
puter model of the image formation process. 

The surface model is chosen to suit the desired ap- 
plication. For example it could be a CAD model if the 
data was images of machine parts, or a spline model if 
the data came from more general man-made objects. 
In the application we describe in this paper we are 
interested in recovering planetary surfaces, and hence 
we choose to describe the surface by a triangular mesh. 
This is a very standard model in computer graphics, 
but is little used in computer vision (a notable excep- 
tion is [?] (zisserman) ). There is also the flexibility 
to specify the density of the mesh, which need not be 
spatially constant. 

The computer model of the image formation pro- 
cess is the area of computer graphics known as render- 
ing. A model of how light is reflected from the surface 
is used, together with an abstraction of how the im- 
age is formed, to synthesize the image that would have 
been recorded from the current surface under the light- 
ing conditions and camera position and orientation. 
These light reflectance parameters are parameters of 
the surface model that are to be inferred together with 
the geometrical parameters. The theoretical exposi- 
tion in this paper is valid for any parameterized re- 
flection function. The results assume Lambertian re- 
flection. 

The formation of the synthetic image from the pa- 
rameterized model and the lighting and camera pa- 
rameters is discussed in more detail in section 3. Here 
we just note that current rendering technology is un- 
suitable for our purpose because it operates in image 
space y and so the image formed has artifacts due to 
the relative sizes of the projection of the surface el- 
ements onto the image plane and the pixels. These 
artifacts are noticeable along the edges of surface ele- 
ments. Also, if the projections of the surface elements 
in the pixel plane are very small, such as when we are 
trying to infer a high resolution mesh from many low 
resolution images, these artifacts dominate the ‘stan- 
dard’ rendering process. The Tenderer that is required* 
for this purpose is one that operates in object space , 



Figure 1: An outline of the Bayesian approach to sur- 
face reconstruction 

and in section 3 we briefly describe such a system. An 
object-space Tenderer can also be made to compute 
the derivatives of the pixel values with respect to the 
surface model parameters. This is a crucial part of the 
inference process, and is described in detail in section 
3.1. 

One of the major goals of this paper is to develop 
Bayesian inference approach for 3D super-resolved 
surface reconstruction when the resolution of inferred 
surface mesh is higher than the spatial resolution of 
input images. This also allows to achieve the image 
super-resolution: synthetically produced images of the 
super- resolved surface model can be at higher resolu- 
tion that that of input data images. 

1.1 Previous Work 

Most previous work in the area of the estimation of 
three dimensional surfaces from image data has used 
one of two main approaches, shape from shading (more 
generally, ‘shape from X’) and shape from stereo. 

In shape from shading [?] the image gradients are 
used, and, using the assumption of orthographic pro- 
jection, are related to the surface derivatives. Using 
assume^ boundary conditions the surface derivatives 
can be integrated to produce an estimate of the surface 
heights (more strictly, the distance from the camera to 
the points on the surface). 

Shape from stereo [?] uses correspondence matches 
between features in the images to give the disparity 
between these features. Finding these correspondence 
matches is aided by using the epipolar constraint [?]. 
If the camera geometry is known then the disparity 
can be directly related to the distance of the feature 
from the camera, and the feature can be located in 
space. The discrete points corresponding to the fea- 
tures matched in the images are then joined to form a 
representation of the surface. 

A drawback of shape from stereo is that the density 
of points in the recovered surface is unknown a-priori, 
and is dependent on the number and density of fea- 


tures found in the images. A feature detector giving 
few features gives a very coarse surface representation; 
one that gives very many is likely to produce features 
that are hard to match between images. 

Shape from shading has the drawback that the spa- 
tial density of the recovered surface is fixed at the 
image density. It is also difficult to apply if the re- 
flectance properties are spatially varying. Both of 
these approaches have difficulty incorporating new ob- 
servations of the surface that become available after 
the initial estimate is made. 

Bayesian approach was used for image super- 
resolution in the number of papers beginning from [6]. 
In the previous work [7] input images were taken from 
roughly the same direction under the similar lighten- 
ing conditions. In this work the surface model is essen- 
tially represented as 2D texture map. In our case of 
fully 3D surface reconstruction this restriction is lifted: 
low resolution input images as well as high-resolution 
output images can correspond to a very different val- 
ues of registration parameters. 

2 A Bayesian Framework 

In this paper we consider the following surface 
model. The geometry is represented by a triangular 
mesh. We consider the case of Lambertian surfaces 
and store surface reflection properties in the vector 
of albedo values associated with the vertices of the 
triangular mesh. We will assume known the camera 
parameters and the parameters of the lighting. The 
estimation of these parameters will be considered in 
a forthcoming paper using the same Bayesian frame- 
work. 

Thus we represent the surface model by the pair of 
vectors [h y p]. The components of these vectors cor- 
respond to the height and albedo values defined on a 
regular grid of points 

[hp] = {(*« ,Pt), i=t(qx + py)} q,p = 0,1,... 

( 1 ) 

where l elementary grid length, x, y are an orthonor- 
mal pair of unit vectors in the (x,y) plane and i in- 
dexes the position in the grid. The pair of vectors 
of heights and albedos represent a full vector for the 
surface model 

u = [h p). (2) 

To estimate the values of h y p from image data, we 
apply Bayes theorem which gives 

p(h y p\Ii ... If ) oc p(/i ...I F \h,p)p(h y p) 

where // (/ = 1, . . . , F) is the image data. This states 
that the posterior distribution of the heights and the 






albedos is proportional to the likelihood - the prob- 
ability of observing the data given the heights and 
albedos - multiplied by the prior distribution on the 
heights and albedos. 

The prior distribution is assumed to be Gaussian 
p{h,p) a exp u T j , (3) 

r-i = f QK o 

[ 0 Q/al J’ 

where the vector of the surface model parameters u is 
defined in (2). The inverse covariance matrix is con- 
structed to enforce the smoothing constraint on local 
variations of heights and albedos. We penalize the 
integral over the surface of the square of the surface 
curvature c(x, y) = h\ x + + 2 h 2 xy , and similarly for 

albedos. We approximate the partial derivatives in 
c(x, y) using finite differences of the height (albedo] 
values. Then coefficients of Q form a 5x5 template f 
and result from summing c(x, y) over the surface. 

Ql^ m = r,.„, q,P — —2, . . . , 2. (4) 

For brevity we do not provide here an explicit values 
of coefficients IY P . Two metaparameters an and <j p in 
equation (3) control the expected values of the surface- 
averaged curvatures for heights and albedos. 

This prior is placed directly over the height vari- 
ables, h, but albedos are only defined over the range 
[0 — 1]. Because of this we put in equation (3) the 
Gaussian prior for the albedos over transformed vari- 
ables, where 


p -t log(p'/(l - p')), u = M (5) 

In the vector of model parameters u values of p are 
replaced by values of p'. 

For the likelihood we make the usual assumption 
that the differences between the observed data and the 
data synthesized from the model have a zero mean, 
Gaussian distribution, and also assume that the im- 
ages If comprising the data are conditionally inde- 
pendent. This gives 


p(I t .. .If\Kp) oc exp 


( E ,jrfp-ifp(^p )) 2 

{ 2<r? 


) 


where I/ P (h,p) denotes the pixel intensities in the im- 
age / synthesized from the model, a\ is the noise vari- 
ance and the summation is over the pixels (p) and over 
all images (/) used for the inference. 


Consider the negative log-posterior. 


L{h ,p) « 


IZ/,p(7 / 1 


AKp)? 


where the model parameters vector is defined in (2). 
This is a nonlinear function of h } p and the MAP esti- 
mate is that value of h,p which minimizes L{h y p ). 

In the case of images with no shadows or visible 
occlusions which we consider here, the log-posterior 
is in general unimodal and gradient methods can be 
applied for minimizing L(h, p). We linearize I{h,p) 
about the current estimate, ho,po 


I(h,p) = /(h 0 ,po)+D (u-uo), D = 

(7) 

where D is the matrix of derivatives evaluated at 
ho,Po- Then minimization of L{h,p) is replaced by 
minimization of the quadratic form: 


p difp 1 
dz x ’ dp[ j 



Here A is the Hessian matrix of the quadratic form 
and vector b is a gradient of a likelihood L computed 
at current estimate. We search for the minimum in x 
using a conjugate-gradient method. At the minimum 
we update the current estimate, uj = uo + x, recom- 
pute / and D, and repeat the minimization procedure 
iteratively until 1 the current estimate u* approaches 
a global minimum of L(h,p). 

Thus to find the MAP estimate requires that we can 
render the image and compute the derivatives for any 
values of the surface model parameters. We discuss 
this computation in some detail in the next sections. 
Here it is sufficient to note that while forming / us- 
ing only object space computation (see section 3) is 
computationally expensive, we can compute D at the 
same time for little additional computation. Also the 
derivative matrix is sparse with the number of nonzero 
entries a few times the number of model parameters. 
This makes the process described above a practical 
one. Convergence is also accelerated by using a multi- 
grid approach. 

At convergence we compute a new inverse covari- 
ance matrix, (E -1 )' = E” 1 + DD r /(jg. This is then 
used as a prior inverse covariance matrix when new 
image data of the same surface is obtained, enabling 



A 



Figure 2: Geometry of the triangular facet, illumina- 
tion direction and viewing direction 

a recursive update and integration of data recorded 
at different times. The posterior inverse covariance 
matrix gives information about the uncertainty of the 
estimated surface. 

3 Formation of the image and the 
derivative matrix. 

The task of forming an image, /, given a surface 
description, h,p, and camera and illumination pa- 
rameters is the area of computer graphics known as 
rendering[2]. Most current rendering technology is fo- 
cused on producing images which are visually appeal- 
ing, and producing them very quickly. As discussed 
in the introduction, this results in the use of image- 
space algorithms, with the fundamental assumption 
that each triangle making up the surface, when pro- 
jected onto the image plane, is much larger than a 
pixel. This makes reasonable the assumption that 
any given pixels receives light from only one triangle, 
but does produce images with artifacts at the triangle 
edges. This approach also produces inaccurate images 
if the triangles project into areas much smaller than 
a pixel on the image plane, as the pixel will then be 
colored with a value coming from just one of the tri- 
angles. 

Clearly this approach is not suitable for high- 
resolution 3D surface reconstruction from multiple im- 
ages. The triangles in a high- resolution surface may 
project onto an area much smaller than a single pixel 
in the image plane (sub-pixel resolution). Therefore, 
as discussed in the introduction, for our system we 
implemented a Tenderer for triangular meshes which 
performs ail computation in object space. At present 
we neglect the blurring effect due to diffraction and 


due to the role of pixel boundaries in the CCD array. 
Then the light from a triangle as it is projected into a 
pixel contributes to the brightness of the pixel with a 
weight factor proportional to the fraction of the area 
of the triangle which projects into that pixel. This 
produces perfectly anti-aliased images and allows an 
image of any resolution to be produced from a mesh 
of arbitrary density, as required when the system per- 
forming the surface inference may have no control over 
the image data gathering. 

Our Tenderer computes brightness I p of a pixel p in 
the image as a sum of contributions from individual 
surface triangles A whose projections into the image 
plane overlap, at least partially, with the pixel p. 

4 = £/£* a. 0 ) 

A 

Here is a radiation flux reflected from the triangu- 
lar facet A and received by the camera, and /£ is the 
fraction of the flux that falls onto a given pixel p in 
the image plane. In the case of Lambertian surfaces 
and single spectral band is given by the expression 

$a = pE(a a ) cos a v cos* 0 AD, (10) 

E(a') = pA{F cos o' +1°). 

AD = 5/d 2 . 

Here p is an average albedo of the triangular facet. 
Orientation angles a* and a v are defined in figure 2. 
E(a s ) is the total radiation flux incident on the trian- 
gular facet with area A. This flux is modeled as a sum 
of two terms. The first term corresponds to direct ra- 
diation with intensity l 9 from the light source at infin- 
ity (commonly the sun). The second term corresponds 
to ambient light with intensity X a . The parameter 9 
in equation. (10) is the angle between the camera axis 
and the viewing direction (the vector from the surface 
to the camera); k is the lens falloff factor. Afi in (10) 
is the spatial angle subtended by the camera which 
is determined by the area of the lens 5 and the dis- 
tance d from the centroid of the triangular facet to the 
camera. 

We identify the triangular facet A by the set of 3 
indices (io,ii>ia) from the vector of heights (1) that 
determines the vertices of the triangle in a counter- 
clockwise direction (see figure 2). In the r.h.s of equa- 
tion (10) we have omitted for brevity those indices 
from all the quantities associated with individual tri- 
angles. 'the average value of albedo for triangle in (10) 
is computed based on the components of the albedo 
vector p corresponding to the triangle indices 

PA = Plo.li.li = ^(pio + Ph + Pli)' (11) 



We note that using average albedo (11) in the ex- 
pression for <$a is an approximation which is justified 
when the albedo values vary smoothly between the 
neighboring vertices of a grid. 

The area A of the triangle and the orientation an- 
gles in (10) earn be calculated in terms of the vertices 
of the triangle Pi (see figure 2) as follows: 


• Compute the area fractions / A for all pix- 
els p that are overlapped by the projection 
of the triangle by finding the geometrical in- 
tersection of A with those pixels . 

• Update the pixel intensities I p based on 
equation (9) with the contribution from the 
current A. 


h z* =cosa 1 2 * * 5 , n • i* = cosa w , (12) 


Here n is a unit normal to the triangular facet and 
vectors of the edges of the triangle v*j are shown in 
figure 2. 

We use a standard pinhole camera model with no 
distortion in which coordinates of a 3D world point 
P = (x,y,z) axe first rotated with the rotation ma- 
trix R and translated by the vector T into camera 
coordinates, yielding P c = {SciVa *c) 

P C = RP + T. (13) 

(R and T are expressed in terms of the camera regis- 
tration parameters [?]; we do not give them explicitly 
here). After the 3D transformation given in (13) point 
P c in the camera coordinate system is transformed us- 
ing a perspective projection into the 2D image point 
P = (x,y) using a focal length / and aspect ratio a. 


3.1 Computation of the derivative ma- 
trix. 

The inference of the surface model parameters de- 
pends on the ability to compute the derivatives of the 
observations with respect to the model parameters. In 
this section we obtain the derivatives of the pixel in- 
tensities with respect to the surface heights and albe- 
dos, and make some comments about efficient imple- 
mentation. 

According to equation (9), the intensity I p of a pixel 
p depends on the subset of the surface parameters, 
heights and albedos, that are associated with the tri- 
angles whose projections overlap the pixel area. It 
is seen from Eq. (10) that when the surface heights 
are fixed the radiation flux from a given triangle $a 
depends linearly on the average albedos of the trian- 
gle, pa- Then using Eqs. (9) and (5) the derivatives 
of a pixel intensity I p with respect to logarithmically 
transformed albedo values p' A can be written in the 
following form: 


r*i_ / 

' a 0 ' 


’ x c ' 

\Jt 

i 

i 

0 1 


. Vc . 


We use 2D image projections of the triangular ver- 
tices Pi to compute the area fraction factors / A for 
surface triangles (cf. Eq. (9)) 

p = ^polygon (15) 

' Aa 

Here Aa is the area of the projected triangle on the 
image plane and Ap O iyg 0n is the area of the polygon 
resulting from the intersection of the projected trian- 
gle and boundary of the pixel p (see figure 3). 

Thus, our renderering algorithm does the following: 

1. Transform the coordinates of the vertices of the 
surface model, Pi, into 2D image coordinates. 
(Maintaining copies of the points in both the 
world and image coordinate systems.) 

2. For each triangle A in the surface model. 

• Compute a total radiation flux a for every 

triangle, according to equation (10), using 

world coordinates. 


-'»>*!£• (16) 

where the form of the coefficient <9 $a/<9pa immedi- 
ately follows from Eq. (10). Note also that the deriva- 
tive value in (16) is scaled with the area fraction factor 
for a given triangle. 

3.1.1 Derivatives with respect to the heights 
of the vertices of the triangles 

In our object-space Tenderer, which is based on pixel- 
triangle geometrical intersection in the image plane, 
the pixel intensity derivatives with respect to surface 
heights have two distinct contributions 



Variation of the surface height h\ gives rise to varia- 
tions in the normals of the triangles associated with 
this height (in a general triangular mesh, on average 6 
triangles are associated with each height) and this pro- 
duces the derivatives of the total radiations fluxes $a 




Figure 3: The intersection of the projection of a trian- 
gular surface element (io,ix>ia) onto the pixel plane 
with the pixel boundaries. Bold lines corresponds to 
the edges of the polygon resulting from the intersec- 
tion. Dashed lines correspond to the new positions of 
the triangle edges when point Pj 0 is displaced on JP 

to the camera from those triangles. This is the first 
term in equation (17). Also, height variation gives rise 
to the displacement of the corresponding point which 
is the projection of this vertex on the image plane. 
This results in changes to the areas of the triangles 
and polygons with edges containing this point (see fig- 
ure 3). This produces the derivatives of the fractions 
/£, the second term in equation 17. 

When the triangle is completely inside the pixel its 
area fraction /£ = 1 and according to (17) its contri- 
bution to the pixel intensity derivative equals to the 
derivative of the corresponding radiation flux, d$fdz\. 
The flux derivative can be computed directly from the 
coordinates of the triangle vertices and the camera 
position using Eqs. (10) and (12). For the surface tri- 
angle with vertices (P^P^, P|J the flux derivative 
with respect to the z component of the vertex Pj 0 
equals 

^ = ( 18 ) 

where 

g = X 9 (i v cosa, 4- z # cosa„ - ncosa, cosa„) + Z a z v 

and z is a unit normal in the vertical direction. 

When the triangle is projected into more than one 
pixel than the height derivatives of the projected area 
fraction in (17) have to be computed for every pixel 


intersecting with the triangle. This can be done using 
the following chain rule arguments. 

As mentioned above, when the z component of the 
vertex Pj 0 in the 3D world coordinate system is vary- 
ing by 5z, the corresponding 2D point P\ Q in the image 
projection plane is displaced by 5 P as shown in fig- 
ure 3. The corresponding point displacement deriva- 
tive equals: 

dP = _l(a/R l3 +xR 33 \ 
dz z c \ //? 23 +i//?33 ) ' Uy; 

Here we have dropped the vertex index; R tJ are the 
components of the rotation matrix R in (13) and z c is 
the z component of the point P in the sensor coordi- 
nate system, given by equation (13). 

Displacement 6 P of the triangle vertex P gives rise 
to the change 6A& in the area of the projected triangle 
and also the change 6A p0 |yg 0n in the polygon area. It 
then follows from equation(15) that 

d-^A _ J_ { ^polygon _ fP dA A \ dP io 

dz lo A a \ dP io /A c>P io J dz io ■ 1 U> 

Here the point displacement derivative dP\ 0 /z\ 0 is 
given in (19). 

Thus, the task of computing the derivative of the 
area fraction (20) is reduced to the computation of 
9^/aPfe and 8/1^1 /9P,„. Note that the inter- 

section of a triangle and a pixel for a rectangular pixel 
boundary can, in general, be a polygon with 3, 4, 5 or 
6 edges with various possible forms. However the algo- 
rithm for computing the polygon area derivatives that 
we have developed is general, and does not depend on 
a particular polygon configuration. The main idea of 
the algorithm can be described as follows. Consider, 
as an example, the polygon shown in figure 3 which 
is a part of the projected surface triangle with indices 
io,ii,ia. We are interested in the derivative of the 
polygon area with respect to the point that con- 
nects two edges of the projected triangle, (P^P^) 
and (Pio.Pii). These triangular edges contain seg- 
ments (I, J) and (K, L) that are sides of the corre- 
sponding polygon. It can be seen from figure 3 that 
when the point Pj 0 is displaced by <5P| 0 the change in 
the polygon area is given by the sum of two terms 

^polygon = 4" <$Ak,L 

These terms are equal to the areas spanned by the 
two corresponding segments taken with appropriate 
signs. Therefore the polygon area derivative with the 
respect to the triangle vertex P^ is represented as a 




sum of the two “segment area” derivatives for the 2 
segments adjacent to a given vertex. Using straight- 
forward geometrical arguments one can calculate the 
areas and <5 Ak,l t0 first order in the displace- 

ment <5Pi 0 . Then the polygon area derivative can be 
expressed directly in terms of the triangle vertices and 
the endpoints of the polygon segments: I, J, K and L 
(cf. figure 3). 

The details of the polygon area derivative compu- 
tation will be presented elsewhere. Here we provide a 
result for the simplest particular case, i.e. the deriva- 
tive of the area of the projected triangle 


dA A 

dPi 0 




a — 


0 1 

-1 0 


( 21 ) 


The unit antisymetrix matrix a performs a -ir/2 ro- 
tation in the image plane. 

We note that the computation of the derivative ma- 
trix and the surface rendering essentially involves the 
same set of variables (triangle and polygon vertices, 
areas, etc). Therefore both computations can be done 
at the same time. 


4 Adaptiveness and Super-resolution 
aspects 

The correct choice of the smoothness prior (3) is 
very substantial for inferring the surfaces that have 
regions with high curvature (edges) . It is also of espe- 
cial importance in the case of surface super-resolution 
where the spatial resolution of individual pixels is 
greater than the size of a surface triangle. 

Clearly that the values of metaparameters a h and 
ct p should be controlled by the relative sizes of surface 
triangles t and spatial resolution of individual pixels 
p. One needs at least 2 low-resolution images to infer 
heights and albedos at the same time. In this case <7/* 
and <jp should be chosen in such a way to ensure the 
smoothness of a surface patches on a scale ~ p. For 
larger number of low-resolution images the inferred 
surface may have a smoothness scale smaller than p. 
This corresponds to a smaller values of and a p . 

We note however that the smoothness prior remains 
very important even in the “over-constrained” case 
where the number of low-resolution images is relatively 
large and the total number of pixels in all images per 
hight and albedo value is ~ 1. The reason is that 
the spatial structure of the derivative values can be 
very irregular in this regime and smoothness prior es- 
sentially plays a role of regularizer. Indeed, one can 
show based on the analysis from the previous section 
that the magnitude of the derivatives \dl p /dz\\ can be 
much larger for the surface vertices whose triangles in- 
tersect the pixel boundary than for the vertices that 


are projected fully inside of the pixel along with all tri- 
angles surrounding them. When the number of trian- 
gles per pixel increases (typically > 10) this can give 
rise to a strong spatial modulation of the components 
bi of the gradient vector and also matrix elements of 
DD r (8). In general, pixel boundaries from different 
images are not allined and therefore even for several 
images the spatial pattern of bi DD r and can be very 
irregular. 

Regularization is achieved when correlations in- 
duced by the smoothness prior, S’ 1 , would be of the 
same order as correlations induced by observations in 
the matrix DD r . Therefore every time we compute 
derivative matrix D we readjust the metaparameter 
c 7^ so that 

r 0 ,o o\la\ « IV h (DD r ) /N <x( 2 , (22) 

where the traces of DD r are taken with respect to 
height and albedo variables; N is the number of ver- 
tices in a grid, i is the size of individual triangles 
(equation 1). Value of a p is readjusted in a similar 
way. 

Finally to achieve the local adaptiveness of the prior 
we place the curvature penalty on the deviation from 
the current surface estimate, u - u*, but not on the 
estimate itself. In this case the second term in the 
expression the the gradient b in (8) should be omitted. 

5 Results 

As a test example we used a triangulated surface 
which heights correspond to DEM of the Dackwater 
region (Nevada). We prepared surface albedos syn- 
thetically to fit the existing Landsat image data of the 
same region. The surface is of dimension 297 x 297 
heights and the same number of albedos. Sixteen 
low-resolution images of the surface were produced us- 
ing simple perspective sensor model (14), with differ- 
ing lighting and camera orientations. Each images is 
128 x 128 pixels. We used these synthetic images as 
the input data images I. Figure 4 (left) shows a por- 
tion of one of the input images of the size (40x40) 
pixels. 

Starting from a mesh with all zero heights and all 
albedos set to 0.5, the conjugate gradient scheme de- 
scribed above was used to infer the surface shown in 
figure. The surface is of the same dimension as the 
original surface. Not that this is a dense triangulation 
- when projected into the pixel grid of figure 4 many 
triangles fall into one pixel. Thus we infer a super- 
resolved surface - a pixel lying on a mountain ridges 
does not imply a planar region in the inferred surface, 
rather, we infer a surface where highly curved regions 




Figure 4: Left: a region of low-resolution input im- 
age, 96x74 pixels (left). Right: high-resolution recon- 
structed image, 960x740 pixels 


may project into a single pixel. Figure 4 (right) shows 
a high-resolution image synthesized from the same re- 
gion of the inferred surface as that corresponding to 
original data image at the left. This image was ren- 
dered at 10 times the resolution of the original data 
image. Comparison of left and right images figure 4 
highlights the super-resolution aspect of our approach. 

Because we know the original surface the error 
maps can be computed for both heights and albedos 
to judge the quality of inference. Note the vertical 
scales compared with figure . The reconstruction is 
accurate, with most errors being in the regions of high 
curvature. 

6 Conclusions and future extensions 

We have developed a very general framework for the 
inference of general surface geometry and reflectance 
models from image data, where the model choice is de- 
termined by the physical properties of the surface we 
wish to infer. We have demonstrated that for the case 
of a triangulated surface and Lambertian reflectance 
the parameters of a surface model, namely the heights 
and albedos, cam be inferred from a set of image data. 
We have developed a framework that allows easy in- 
clusion of future data observed from the same sur- 
face, and easy incorporation of data from other sensing 
modalities. 

In this paper we assumed the registration parame- 
ters of input data images to be know in advance. In 
principle one can use the Bayesian approach devel- 


oped above to infer the registration parameters of the 
data images along with surface heights and albedos. 
Such inference will include as an essential element the 
derivatives of the intensities of synthetic images with 
respect to registration parameters. 

Another limitation of the current work is related 
to the absence of shadows and visible occlusions in 
input images. Future developments also include the 
addition of the ability to compute correctly both im- 
age and its derivatives when this limitation is lifted. 
Here we only note that the derivatives in the presence 
of shadows/occlusions are nonlocal as the points lay- 
ing on the surface far apart can become correlated. 
This nonlocal derivatives axe very informative as to 
the shape of the surface. 

Among the other extensions are more realistic re- 
flection functions, blurring and modeling of different 
surface topologies. Limits to the accuracy of the 
superresolved surface reconstruction will also be ex- 
plored. 
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